SECTION ONE

REPORT ON SOME PRINCIPLES OF THE UNIFIED TRANSFER SYSTEM (UTS) 

By 

Ariadne Lukjanow 
OE-I-R, INC. 

I. INTRODUCTION

Several approaches have been employed in Machine Translation in the course
of the past few years. These approaches were either determined by specific
objectives or influenced by the background of the research workers. The
objectives range from automatic dictionaries to translations with varying degrees
of accuracy, readability, and perfection. The background of a researcher can
influence his approach to Machine Translation in three basic ways. One approach
may be influenced by machines in such a way that only the development of a new
language computer would lead to acceptable results. Another approach may consist
of an attempt to simulate human reasoning on a standard computer.

A third approach would be to make Machine Translation as mechanical and 
utilitarian as possible, by adapting this attempt to the capabilities of the 
machine and by clearly defining the relationship between man and machine. Since 
present-day computers are best suited to repetitive mathematical operations and 
man is still the best thinker, this last approach will make it possible to 
utilize both of these capabilities to their fullest extent. All thinking will 
be expressed in the form of codes in the dictionary in the manner provided for 
by the system. 

In order to translate at all, any system must provide solutions to the
problem of transferring structure, function, form and meaning from the source
language into the target language. Thus, we can call translation a fourfold
transfer process consisting of:

(1) Transfer of the function of words (parts of speech)

(2) Transfer of the form of words (morphology)

(3) Transfer of the meaning of words (semantics)

(4) Transfer of the location of words (syntax)

Every word has a meaning, even if there occurs a so-called "zero-translation,"
or non-translation. In this system, we shall accept a 1:1
translation as equivalent to the absence of a meaning problem.

Every word in a language has its function; i.e., it is a part of speech
and, unless it is a non-translation item, it also has a location or position
(syntax) qualification. The transfer process can be visualized as a combination of
the following six concepts:

(1) Function (some "particles," some adverbs) 

(2) Function + location (some punctuation marks, some adverbs, 
some gerunds) 

(3) Function + form + location (groups from all parts of speech) 

(4) Function + form (some prepositions, some adverbs, some gerunds,
negations, etc.)

(5) Function + form + meaning + location (groups from every part of
speech)

(6) Function + meaning + location (some adverbs, some conjunctions,
etc.)

Example:

Combination of function and location:

posle - later; adverb with a 1:1 translation equivalent
and location "after verb."

Colon - punctuation mark with a 1:1 equivalent; position: at
the end of a clause.

It is obvious that the elements of the transfer form sets with variants
in each of the elements. We can visualize them as follows:


Function    Form    Meaning    Location

   X         0         0          0
   0         X         0          0
   0         0         X          0
   0         0         0          X
   X         X         0          0
   X         0         X          0
   X         0         0          X
   0         X         X          0
   0         X         0          X
   0         0         X          X
   X         X         X          0
   X         0         X          X
   0         X         X          X
   X         X         0          X

0 - non-variant or absent
X - variant

It would seem that these variations could be expressed in mathematical
formulae, but this is not true because the relationship between the variants
does not follow the rules of permutation or random combinations. In contrast,
these variations follow definite linguistic rules which permit only certain
variants within certain combinations. In order to determine these linguistic
combinations for the elements of transfer, it is necessary to define and
classify each variant for every element of transfer, as well as the relationship
between the variants of each element of the transfer to the variants of the other
three.

This can best be illustrated with prepositions:

ELEMENT OF
TRANSFER      DEFINITION

function      preposition

form          case government; i.e., prepositions demanding the
              genitive, dative, accusative, instrumental, or locative

meaning       prepositions of time (static, earlier, later), location or
              space (where, to where, from where), cause, goal,
              substitution, division, etc.

location      first item in prepositional phrase, or position 1 in
              prepositional phrase


Theoretically, we could produce a transfer combination of preposition +
dative + location (from where) + position 1 of prepositional phrase, but the
grammatical rules and semantic connotations do not permit this type of
combination. The prepositions of location are subject to the following division
only:

LOCATION         GENITIVE   DATIVE   ACCUSATIVE   INSTRUMENTAL   LOCATIVE

a) where?        bliz       po       -            za             v
                 vne                              mezhdu         na
                 mezhdu                           nad            pri
                 sredi                            pered
                 u                                pod

b) where to?     do         k        v            -              -
                                     za
                                     na
                                     pod
                                     skvoz6
                                     cherez
                                     o

c) from where?   iz         -        -            -              -
                 iz-za
                 iz-pod
                 ot
                 s



The above table shows that the "from where?" definition is used
only with the genitive case. Thus, the only usable and meaningful combination
is:

preposition + genitive + location (from where?) + 
first position of prepositional phrase 
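
The restriction described above can be sketched as a simple lookup. The following is only an illustrative sketch in Python: the names LOCATION_TABLE, is_valid_combination, and cases_for are hypothetical and are not part of the UTS, and the data are a reading of the location table above (the placement of "o" under the accusative is an assumption).

```python
# Hypothetical encoding of the location-preposition table.
# Keys are (meaning, case); values are the prepositions listed for that cell.
LOCATION_TABLE = {
    ("where", "genitive"): ["bliz", "vne", "mezhdu", "sredi", "u"],
    ("where", "dative"): ["po"],
    ("where", "instrumental"): ["za", "mezhdu", "nad", "pered", "pod"],
    ("where", "locative"): ["v", "na", "pri"],
    ("where to", "genitive"): ["do"],
    ("where to", "dative"): ["k"],
    # Placing "o" under the accusative is an assumption of this sketch.
    ("where to", "accusative"): ["v", "za", "na", "pod", "skvoz6", "cherez", "o"],
    ("from where", "genitive"): ["iz", "iz-za", "iz-pod", "ot", "s"],
}


def is_valid_combination(meaning, case):
    """Return True if the table lists at least one preposition for the cell."""
    return bool(LOCATION_TABLE.get((meaning, case)))


def cases_for(meaning):
    """Return the cases that can combine with a given location meaning."""
    return sorted(case for (m, case) in LOCATION_TABLE if m == meaning)


if __name__ == "__main__":
    print(is_valid_combination("from where", "dative"))  # rejected combination
    print(cases_for("from where"))
```

Only combinations that survive such a check would be candidates for a unified transfer code.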

In the UTS we accept any meaningful and valid combination of elements of transfer 
expressed in the form of numerical digits as a single unified transfer code. 

Since many words of the source language can be associated with
several function, form, meaning, and location qualifications, it is necessary
to combine single transfer code units into sets of codes which can express
these variations.

Examples:

dannye    nominal
          modifier

vdol6     preposition of genitive
          adverb

s         preposition of genitive

sredi     preposition of location (where?)
          preposition of time (static)

If we consider that we have four elements of transfer, each of which has 
a definite and limited number of variants, it is safe to assume that the number 
of transfer codes is limited and that we may likewise assume that the same 
applies to sets of transfer codes. This leads us to the concept that numerous
words in the dictionary are associated with identical transfer codes or identical 
sets of transfer codes. This fact makes possible the concept of code patterns. 
The number of single transfer code units in the pattern can vary from one to 
several. After examining some 50,000 canonical entries (stems) in the dictionary 
of Smirnitskij, we have decided to set the limit at a maximum of 25 single code 
units in the pattern. 
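
The idea of code patterns can be sketched as a grouping of dictionary entries by their sets of transfer code units. This is a minimal Python sketch under stated assumptions: the unit labels are descriptive placeholders rather than actual UTS numeric codes, "word_x" is an invented entry used only to show two words sharing a pattern, and the function name group_by_pattern is hypothetical. Only the limit of 25 units comes from the text.

```python
from collections import defaultdict

MAX_UNITS = 25  # maximum number of single code units per pattern (from the text)

# Hypothetical dictionary entries: word -> its single transfer code units.
# "word_x" is an invented placeholder, not a real dictionary entry.
ENTRIES = {
    "dannye": ("nominal", "modifier"),
    "vdol6": ("preposition of genitive", "adverb"),
    "word_x": ("adverb", "preposition of genitive"),
}


def group_by_pattern(entries):
    """Group words whose sets of transfer code units are identical."""
    patterns = defaultdict(list)
    for word, units in entries.items():
        if len(units) > MAX_UNITS:
            raise ValueError(f"{word}: more than {MAX_UNITS} code units")
        # The order of units is ignored; only the set of units defines a pattern.
        patterns[frozenset(units)].append(word)
    return dict(patterns)


if __name__ == "__main__":
    for pattern, words in group_by_pattern(ENTRIES).items():
        print(sorted(pattern), "->", words)
```

Each distinct key in the result corresponds to one code pattern shared by all the words listed under it.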

Now let us examine the actual elements of each transfer. Since in
translation we are dealing with at least two languages simultaneously, we have
to develop a criterion for parts of speech, morphology, semantics, and syntax
which would accommodate both languages under consideration, or we must establish
a classification system which in the form of transfer codes would permit us to place
an equal sign between the two languages. This necessitates a certain type of
analysis and of synthesis of the grammars of both languages.

II. THE FUNCTION OF WORDS OR THE CATEGORIZATION OF WORD BEHAVIOR

When examining conventional parts of speech in Russian and English grammars
separately, we note that they contain identical categories such as prepositions,
adverbs, nominals, modifiers, etc. But when we compare these categories of both
languages, we discover that they differ considerably in usage, behavior, and
function. In terms of a translation system, this means that either we have to
introduce new synthetic categories or we have to divide and redistribute words
differently within these categories. Categorizing is, of course, a somewhat
subjective process. That can best be illustrated by examining the English
preposition "to," in the following manner:


QUALIFICATION   ENGLISH            RUSSIAN           BILINGUAL              TRANSFER DATA
                                   EQUIVALENTS       DATA                   (CLASSIFICATION)

Function        1. preposition     1. preposition    1. preposition-like    1. preposition
                                                        item                   code

Behavior        2. introducer of   2. non-existent   2. particle-like       2. particle
                   infinitive                           item                   code


Obviously, the second category in the above table might as well be
classified as a special auxiliary verb (instead of "particle"), but to the
author of the system the definition as "particle" appears more reasonable,
perhaps because of the occurrence of the Russian particle "by" in the verbal
phrase.

In the process of comparative analysis-synthesis, we have established the
following basic categories as transfer parts of speech (listed alphabetically):

(1) adjectival modifier

(2) adjective/noun

(3) adverb (incl. some gerunds and the particle li)

(4) adverbial modifier (type: bolee, menee, etc.)

(5) auxiliary verb (byl, byli, etc.)

(6) auxiliary verb (moch6, khotet6, etc.)

(7) conjunction

(8) negation (incl. some negative adverbs)

(9) nominal (animate), incl. some pronouns

(10) nominal (inanimate), incl. some pronouns and numerals

(11) nominal (formulae, cardinal numbers, missing words)

(12) numerical modifier

(13) particle

(14) participial modifier

(15) preposition

(16) pronominal modifier

(17) pronoun (type: nami, vami, imi, etc.)

(18) pronoun (soboj)

(19) punctuation marks (each treated as a separate
category, a total of six)

(20) verb (including participles such as izucheny,
otkrytyj, etc.)


The assignment of these basic categories to individual words is a discrete 
and subjective process. It can give valid results only if all other factors and 
constituent parts of transfer are being taken into consideration. We proceed 
from the parts of speech as categories to their classification. That can be 
expressed in the form of a numeric code. 

We know that sentences and phrases are combinations of these categories
and that these combinations cannot be produced by random distribution of words.
Words have to occupy certain positions in order to form a meaningful combination
or phrase.

If we take the three-word phrase "in this room," we cannot convey the
same idea by a redistribution of the participating words:

"this in room" 

"this room in" 

"room in this" 

"room this in" 

"in room this" 

We will either get a meaningless jumble of words or convey a different 
idea. We say "our new building," but not "new our building." We place some 
adverbs before verbs, some after them. Some of these phenomena can be explained, 
some are ascribed to usage, but others escape any logical explanation. 

Dealing with 26 categories and considering each of them in relation to
the other 25, we can establish a hierarchy within the meaningful combinations
of parts of speech; i.e., logical sequences.

This point can be illustrated by the position of words within the sequence
of a prepositional phrase consisting of a preposition (P), a nominal (N), two
adjectival modifiers (AM), and a pronominal modifier (PM):

P before N
AM before N
PM before N
PM before AM
P before PM
P before AM
AM = AM

Thus, we arrive at P-PM-AM-AM-N; or if we assign numerical values to
these categories and would like them to form a progression of i1, i2, i3, etc.,
we will emerge with the following correlations:

P<N; AM<N; PM<N; PM<AM; P<PM; P<AM; AM=AM; e.g., P<PM<AM=AM<N.

Approaching our categories of parts of speech with these criteria, we can assign
numerical values or codes to parts of speech (all codes are in octal notation):

01       comma
02       conjunction
16       preposition
17       adverb
20       negation
21       participial modifier
22       pronoun (nami, vami, etc.)
23       auxiliary verb (byl, bylo, etc.)
24       auxiliary verb (moch6, khotet6, etc.)
25       particle
26       verb
27       pronoun (soboj)
37       adverbial modifier
45       pronominal modifier
46       numerical modifier
47       adjectival modifier
55       adjective/noun
65       nominal (animate)
66       nominal (inanimate)
67       nominal (formulae, numbers, missing words)
70-77    punctuation marks (colon, semicolon, dash,
         period, etc.)

We are fully aware that this progression method for the identification of
a phrase or logical sequence is reliable only in so-called normal sequences.
Interrupted sequences or inverted word order require additional re-examination
and even actual recognition of constituent parts of sequences. In such cases
specific instructions are necessary.

We have, however, established the fact that more than 80%* of sequences
are so-called normal sequences. That frees us of the necessity to recognize
at all times every constituent part of all sequences, as well as every
possible combination of the constituent parts.

The sequences established through progression codes are by no means 
permanent or final divisions within the sentence. They can become smaller or 
disintegrate into single items through either the demands of other components 
of the transfer process, or through so-called verification instructions. 

Example: 

A sequence ending with code 47 (adjectival modifier) will 
call for verification instructions of: 

(1) a sequence within a sequence; 

(2) a sequence with homogeneous parts of speech plus 
conjunction and/or comma; etc. 

We can therefore state that progression codes divide sentences into 
working units which may or may not become final sequences or phrases. This
once more confirms the idea of a total or unified transfer versus a single 
transfer concept on a different level within the limitations of each phase 
(structural, morphological, semantic). 
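
The division of a sentence into working units by progression codes can be sketched as follows. This Python sketch rests on a simplifying assumption that is not stated in this form in the report: a working unit is taken to be a run of non-decreasing part of speech codes, and a new unit starts whenever the code drops. The function name split_into_sequences is hypothetical; the octal codes are those of the table of part of speech codes.

```python
# Part of speech codes from the report, written as Python octal literals.
PREPOSITION = 0o16
VERB = 0o26
PRONOMINAL_MODIFIER = 0o45
ADJECTIVAL_MODIFIER = 0o47
NOMINAL_INANIMATE = 0o66


def split_into_sequences(codes):
    """Split a list of codes into runs in which the codes never decrease.

    Assumption of this sketch: a drop in the code value marks the start of
    a new working unit. Verification instructions are not modelled.
    """
    sequences = []
    current = []
    for code in codes:
        if current and code < current[-1]:
            sequences.append(current)
            current = []
        current.append(code)
    if current:
        sequences.append(current)
    return sequences


if __name__ == "__main__":
    # P-PM-AM-AM-N followed by a verb and a second prepositional phrase.
    sentence = [PREPOSITION, PRONOMINAL_MODIFIER, ADJECTIVAL_MODIFIER,
                ADJECTIVAL_MODIFIER, NOMINAL_INANIMATE, VERB,
                PREPOSITION, NOMINAL_INANIMATE]
    for unit in split_into_sequences(sentence):
        print([oct(code) for code in unit])
```

The resulting units are only candidates; as the text notes, they may later be confirmed, subdivided, or dissolved by the other components of the transfer.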

The division into sequences is made in accordance with:

A1 ... An = part of speech code
B1 ... Bn = part of speech code with 1st digit being 6
C1 ... Cn = part of speech code with value of 10

[Diagram of the division rule relating the A, B, and C codes; not recoverable from the source.]

* Between 20,000 and 25,000 words in various fields of knowledge have been
examined for this purpose.

From: JOURNAL OF CHEMICAL INDUSTRY, vol. 22, no. 9 (1952)

Izucheny reaktsii mezhdu ehtilovym ehfirom pirokatekhinfosforistoj
kisloty i triarilbrommetanami.

Pri vzaimodejstvii ukazannykh soedinenij obrazuiutsia pirokatekhinovye
ehfiry triarilmetilfosfinovykh kislot.

Pri omylenii poslednikh slaboj solianoj kislotoj polucheny pirokatekhin
i triarilmetilfosfinovye kisloty.

V nastoiascem issledovanii nami izuchalis6 reaktsii mezhdu smeshannymi
ehfirami fosforistoj kisloty tipa ... i triarilbrommetanami.

Reaktsiia mezhdu ehtilpirokatekhinovym ehfirom fosforistoj kisloty
i triarilbrommetanami po analogii s alkilfosforistymi ehfirami dolzhna idti
po reaktsii: ...

Ehksperimental6nye dannye pokazali, chto reaktsiia dejstvitel6no
protekaet po ukazannomu uravneniiu.

Tak, naprimer, pri nagrevanii smesi triarilbrommetana i
ehtilpirokatekhinovogo ehfira fosforistoj kisloty proiskhodit vydelenie
bromistogo ehtila i obrazovanie kristallicheskogo vescestva, predstavliaiuscego
soboj pirokatekhinovyj ehfir triarilmetilfosfinovoj kisloty.

Dlia ustanovleniia stroeniia poluchennogo soedineniia byla provedena
reaktsiia omyleniia razbavlennoj solianoj kislotoj pri nagrevanii ot 180
do 200° v zapaiannykh trubkakh.

Produktom omyleniia iavliaiutsia pirokatekhin i triarilmetilfosfinovaia
kislota.
Poluchennye nami ehfiry tipa ... ves6ma ustojchivy k vlage vozdukha.

Concerning the action of triarylbromomethanes on alkylpyrocatechol
esters of phosphorous acid.

Reactions between the ethyl ester of pyrocatechol-phosphorous acid and
triarylbromomethanes were studied.

(Up)on the interaction of the above-mentioned compounds, pyrocatechol
esters of triarylmethylphosphinic acids are formed.

(Up)on hydrolysis of the latter with dilute hydrochloric acid, pyrocatechol
and triarylmethylphosphinic acids were obtained.

In the present investigation, the reactions between mixed esters of
phosphorous acid of the type ... and triarylbromomethanes were studied /by us/.

The reaction between the ethylpyrocatechol ester of phosphorous acid and
triarylbromomethanes should proceed, by analogy with alkylphosphorous esters,
according to the reaction:

Experimental data showed that the reaction actually proceeds according to
the above-mentioned equation.

Thus, for example, upon heating of a mixture of triarylbromomethane and
the ethylpyrocatechol ester of phosphorous acid, evolution of ethyl bromide
occurs and (there occurs) the formation of a crystalline substance which is the
pyrocatechol ester of triarylmethylphosphinic acid.

In order to establish the structure of the compound obtained, a (reaction of)
hydrolysis with dilute hydrochloric acid was carried out on heating (at) from
180 to 200° in sealed tubes.

The product(s) of hydrolysis are pyrocatechol and triarylmethylphosphinic
acid.

The esters obtained by us of the type ... are extremely resistant to the
moisture of the air.

Examined in terms of the progression code, the preceding Russian text
sample appears as follows:

It has been found convenient to make the part of speech code part of the
pattern number, so that we can determine the possible logical sequence or
working area immediately after the dictionary look-up.

III. FORM OF WORDS (MORPHOLOGY)

With the part of speech codes, we have devised the means to divide
the sentence into possible structural (constituent), sequential (progressive),
meaningful combinations, i.e., phrases or fractions of the sentence.

The next step would be to establish in which way the constituents
of the sequence depend on each other, and what demands they place on each
other, if any (i.e., either to confirm the sequence or to divide the original
sequence into smaller sequences or even single items).

The morphological criteria we are using for this purpose are case,
gender, number, and the absence of these. For the sake of convenience, we
define the demands, government, agreement, and influence as "agreement in"
case, gender, number.

Numerical values used are:

(1) Agreement in case        #1-7

(2) Agreement in gender      #1-3

(3) Agreement in number      #1-2

(4) No agreement necessary   0

Note: Case #7 represents the usage exemplified by v riadu, sadu,
na lugu, etc.

v - prep., acc., locative
riadu - nominal in dative

Despite the disagreement in case, it is a meaningful
combination in which words "belong together"
or form a valid sequence.

Positions of the morphological 3-digit code are as follows:

case | gender | number

Digits representing case agreement:

0 - no case

1 - nominative

2 - genitive

3 - dative

4 - accusative

5 - instrumental

6 - locative

7 - auxiliary

Digits representing gender agreement:

0 - no gender

1 - masculine

2 - feminine

3 - neuter

Digits representing number agreement:

0 - singular, or no number

1 - plural

2 - number disagreement (used in impersonal verbs, etc.)

All these morphological qualifications can occur singly or in
combinations.

Table of possible morphological codes:

If we consider now the relationships possible between the concepts
expressed in the part of speech code and the units of morphological codes, we
can establish combination sets of codes, i.e., the partial code patterns.

In this report, we shall do so for one part of speech: the
prepositions.
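
The three-digit morphological code described in this section (case, gender, number, in that order) can be illustrated with a small decoder. This is a sketch only: the function name decode_morph_code and the dictionary names are hypothetical, and the digit meanings are copied from the lists of digits given for case, gender, and number agreement.

```python
# Digit meanings as listed in the report.
CASES = {
    0: "no case", 1: "nominative", 2: "genitive", 3: "dative",
    4: "accusative", 5: "instrumental", 6: "locative", 7: "auxiliary",
}
GENDERS = {0: "no gender", 1: "masculine", 2: "feminine", 3: "neuter"}
NUMBERS = {0: "singular or no number", 1: "plural", 2: "number disagreement"}


def decode_morph_code(code):
    """Decode a 3-digit morphological code string such as "200"."""
    if len(code) != 3 or not code.isdigit():
        raise ValueError("a morphological code has exactly three digits")
    case, gender, number = (int(digit) for digit in code)
    if case not in CASES or gender not in GENDERS or number not in NUMBERS:
        raise ValueError(f"digit out of range in code {code!r}")
    return CASES[case], GENDERS[gender], NUMBERS[number]


if __name__ == "__main__":
    print(decode_morph_code("200"))  # a preposition demanding the genitive
    print(decode_morph_code("421"))
```

For prepositions only the first digit carries information, which is why their codes in this report end in two zeros.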


Prepositions, as we know, do not demand "agreement in" number or 
gender. Therefore, we are dealing with only a case agreement. 

Prepositions table:

CASES            ONE CASE PREP.                2 CASE PREP.      3 CASE   2 CASE &
                                                                 PREP.    AUXILIARY

2 (genitive)     bez, bliz, vdol', vmesto,     mezhdu (mezh)     s
                 vne, vnutri, vozle, vokrug,
                 dlia, do, iz, iz-za, iz-pod,
                 krome, mimo, nakanune, okolo,
                 ot, posle, posredi, protiv,
                 radi, sredi, u

3 (dative)       k, blagodaria, vopreki,                         po
                 podobno, soglasno, naperekor,
                 navstrechu

4 (accusative)   pro, skvoz', cherez           v, na, za,        s, po    v, na
                                               pod, o (ob)

5 (instr.)       nad, pered                    za, pod,          s
                                               mezhdu (mezh)

6 (locative)     pri                           v, na, o (ob)     po       v, na

7 (auxiliary)                                                             v, na

On the basis of this table, we can say that some prepositions (code
16 or 17) can be associated with one, two, or three morphological units.
The total code patterns will then be as follows:

ONE UNIT      2 UNIT        3 UNIT
PATTERNS      PATTERNS      PATTERNS

16 - 200      16 - 200      16 - 200
                 - 500         - 400
                               - 500

16 - 300      16 - 400      16 - 300
                 - 600         - 400
                               - 600

16 - 400      16 - 400      16 - 400
                 - 500         - 600
                               - 700

16 - 500      17 - 200
                 - 000

16 - 600      17 - 300
                 - 000

              17 - 400
                 - 000
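
The way code patterns follow from case government can be sketched in a few lines of Python. This is an illustration under assumptions, not the UTS procedure itself: the function names code_pattern and group_by_code_pattern are hypothetical, the pattern is written on one line ("16-200-500") rather than stacked, only a small sample of the prepositions table is included, and the patterns with code 17 are not modelled.

```python
PREPOSITION_CODE = "16"  # octal part of speech code for prepositions

# First digit of the 3-digit morphological code for each case.
CASE_DIGIT = {
    "genitive": 2, "dative": 3, "accusative": 4,
    "instrumental": 5, "locative": 6, "auxiliary": 7,
}

# A small sample of the prepositions table: preposition -> governed cases.
GOVERNMENT = {
    "iz": ["genitive"],
    "k": ["dative"],
    "mezhdu": ["genitive", "instrumental"],
    "za": ["accusative", "instrumental"],
    "pod": ["accusative", "instrumental"],
    "s": ["genitive", "accusative", "instrumental"],
    "po": ["dative", "accusative", "locative"],
}


def code_pattern(cases):
    """Build a one-line code pattern such as "16-200-500" from governed cases."""
    units = sorted(CASE_DIGIT[case] * 100 for case in cases)
    return "-".join([PREPOSITION_CODE] + [f"{unit:03d}" for unit in units])


def group_by_code_pattern(government):
    """Map each code pattern to the prepositions that share it."""
    groups = {}
    for preposition, cases in government.items():
        groups.setdefault(code_pattern(cases), []).append(preposition)
    return groups


if __name__ == "__main__":
    for pattern, prepositions in group_by_code_pattern(GOVERNMENT).items():
        print(pattern, prepositions)
```

Prepositions with the same case government, such as za and pod, fall under a single pattern, which is what keeps the number of patterns small.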


In this fashion, the 49 prepositions of the prepositions table are
associated with 14 code patterns which would accomplish function and form
transfer.

IV. MEANING OF WORDS AND MEANING CLASSES

The analysis of languages in establishing meaning categories is of 
subjective character and is based on a mental process, which not only requires 
an intimate knowledge of the languages to be analyzed but also a very care- 
ful manipulation of the numbering system in order to prevent an unintentional 
conflict of meaning classes in the code patterns within sequences. 

In this report, we shall attempt to establish some of the criteria of 
analysis and the nature of classification of semantic or meaning definitions. 

In this area we have to make a distinction between situations which
can be described as "a word by itself" and "a word in different environments."
There are distinctly two levels of meaning ambiguity: (1) a so-called
subject matter ambiguity which fits into the category of "a word by itself,"
and (2) an environmental ambiguity, i.e., "a word in different environments."

Example of the first type of ambiguity:

AKT = 1. act

2. legal deed (law)

3. convocation (education)

OBRAZOVANIE = 1. education (education)

2. formation (technical subject matter)

Board = 1. piece of wood

2. food (household arrangement)

3. stage (theater)

4. a council (political science)

5. an action, as in "to board a train"

The subject-matter ambiguity can sometimes be solved through the
environment; for instance, if we encounter the word obrazovanie with
modifiers like kristallicheskoe (crystal), kislotnoe (acid), etc., there is no
doubt that the meaning of this word is formation. Outside of environmental
influences, we have to depend on the subject matter of the article or book
to be translated, i.e., the microglossary, and use that as a cue for selection.
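
The two cues just described, environment first and subject matter second, can be sketched as a small selection routine. This Python sketch is illustrative only: the names SENSES, MODIFIER_CUES, and choose_sense are hypothetical, the subject labels are placeholders for a microglossary identifier, and the data are limited to the obrazovanie example given in the text.

```python
# Senses of a word by subject matter (microglossary), from the example above.
SENSES = {
    "obrazovanie": {"education": "education", "technical": "formation"},
}

# Modifiers in the environment that settle the meaning regardless of subject.
MODIFIER_CUES = {
    "obrazovanie": {"kristallicheskoe": "formation", "kislotnoe": "formation"},
}


def choose_sense(word, modifiers, subject):
    """Pick a translation: environmental cue first, then the microglossary."""
    cues = MODIFIER_CUES.get(word, {})
    for modifier in modifiers:
        if modifier.lower() in cues:
            return cues[modifier.lower()]
    # No environmental cue found: fall back on the subject matter of the text.
    return SENSES[word][subject]


if __name__ == "__main__":
    print(choose_sense("obrazovanie", ["kristallicheskoe"], "education"))
    print(choose_sense("obrazovanie", [], "education"))
```

In this sketch the environmental cue takes precedence over the microglossary, which mirrors the order in which the two cues are introduced in the text.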

We have to expect on the level of subject matter meaning ambiguity a certain
degree [remainder of sentence illegible in source].

The second level of meaning ambiguity, the environmental ambiguity,
is subject to meaning categorization or classes.

We shall describe the method of arriving at these classes, as well as 
some class definitions, through the examination of environment relationships 
in Preposition - Nominal sequences. 

Prepositions by their meaning connotations can be divided into a 
variety of groups. We shall list some of them here: 


(1) Prepositions of time

Simultaneous/Static     Earlier              Later
(when)                  ("before when")      ("after when")

v, za, na, po,          do, k, za,           ot, po, s,
pri, s, sredi           pered                cherez

(2) Prepositions of space and location

Where                   Where to             Where from

bliz, v, vne, za,       do, v, k, za, na,    iz, iz-za, iz-pod,
na, nad, mezhdu         o (ob), pod,         ot, s
(mezh), pered, po,      skvoz', cherez
pod, pri, sredi, u

(3) Prepositions of cause

For whom, for what, why, etc.

za, iz, iz-za, ot, po, s

(4) Prepositions of goal

dlia, do, v, za, k, na, po, radi

(5) Part of the whole

(6) Exchange or replacement

za, vmesto

(7) What is it made from



Now let us consider the category of space or location connotation.
We have already divided this category into three sub-categories: (1) The
first sub-category implies the specific position of something, generally
recognized by yielding an answer to the question, "Where?" It implies a
point of location. (2) The second sub-category implies the concept of
something proceeding towards a certain location, generally recognized by
yielding the answer to the question, "Where to?" It implies a point of
destination. (3) The third sub-category expresses the idea of something
coming from a certain location, generally recognized by yielding the
answer to the question, "From where?" It implies a point of origin.

These sub-categories in turn can be divided further by analyzing 
specific prepositions. 

The prepositions s, iz, and iz-pod all belong to the "From
where?" or point of origin class. They differ in their semantic content
on an individual basis. When s is used, it designates either the place
from which the object is removed by some agent, or the place from which
an object capable of locomotion is leaving. This last instance usually
involves geographic locations in connection with persons or modes of
transportation of persons. When iz is used, it designates an object leaving by
any means any location that has an egress, or the emergence of an object
from another object. When iz-pod is used, it designates an emergence in
any manner from under something on the part of an object.

Examples: s Kavkaza, s gory, s sobraniia, etc.
iz goroda, iz derevni, iz avtomobilia, etc.
iz-pod kamnia, iz-pod stola, iz-pod knigi, etc.

The prepositions do, k, and cherez, all belonging to the "Where to?" or
point of destination sub-category, again differ in their meaning content
on an individual basis. When do is used, it designates the direction of
movement with the definite connotation of limitation or boundary. When k
is used, it again designates the direction of the movement, but its
definite connotation is to achieve only proximity to the destination. When
cherez is used, it designates a penetrating movement through some medium,
usually with some difficulty attached to it, and it also designates a
movement of directly surmounting a difficult medium.

Examples: do goroda, do Washingtona, etc.
k beregu, k gorodu, k reke, etc.
cherez les, cherez bar'er, etc.

The prepositions u, bliz', and pri, all belonging to the "Where?"
or specific location sub-category, differ on an individual basis in their
semantic content. When pri is used, it designates that one object is
adjoining another one. On the other hand, u indicates immediate closeness
of objects; bliz', in turn, indicates only closeness of objects.

Examples: u reki, u berega,
bliz' goroda, pri stantsii, etc.

In most of these instances, the translation of the prepositions is
at variance with their literal meaning (1:1 equivalent).

                       PREPOSITIONS   LITERAL       SPECIAL
                                      MEANING       MEANING

"FROM WHERE" CLASS     s              with          from
                       iz             from          from
                       iz-pod         from under    from under

"TO WHERE" CLASS       do             to            to
                       k              to            toward
                       cherez         through       through/over

"WHERE" CLASS          u              at            by
                       bliz'          near          near
                       pri            at            at


All these categories/classes, in turn, have to be divided again into
smaller groups. For example: iz-pod in relation to location-objects does
definitely mean from under, but with location-cities the meaning of it
becomes from the vicinity of.

Example: iz-pod stola = from under the table

iz-pod Washingtona = from the vicinity of Washington

The prepositions so far have been analyzed for their spatial or
locational relationships. The same prepositions can also be analyzed with
the view of other semantic criteria.

For example, the preposition iz with the connotation of selection
will in some instances keep the translation from, but in the environment of

a) before plural pronouns: nikh, vsekh, nas, tekh, etc.

b) before numerals: dvukh, trekh, etc.

c) before collective nominals like: chlenov, predstavitelej,
iuristov, etc.

will become the preposition of selection, that is, one of many or part of
the whole, with the translation of.

Example: luchshij iz vsekh = best of all

Komitet iz predstavitelej = committee of representatives
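
The environmental rule for iz can be sketched as a lookup on the word that follows the preposition. This Python sketch simplifies matters deliberately: plain word lists stand in for the meaning classes that an actual dictionary coding would carry, the lists contain only the items named in the text, and the function name translate_iz is hypothetical.

```python
# Environments named in the text; a real dictionary would use class codes.
PLURAL_PRONOUNS = {"nikh", "vsekh", "nas", "tekh"}
NUMERALS = {"dvukh", "trekh"}
COLLECTIVE_NOMINALS = {"chlenov", "predstavitelej", "iuristov"}

SELECTION_ENVIRONMENT = PLURAL_PRONOUNS | NUMERALS | COLLECTIVE_NOMINALS


def translate_iz(next_word):
    """Translate iz as "of" in a selection environment, otherwise as "from"."""
    if next_word.lower() in SELECTION_ENVIRONMENT:
        return "of"
    return "from"


if __name__ == "__main__":
    print("luchshij iz vsekh ->", translate_iz("vsekh"))
    print("iz goroda ->", translate_iz("goroda"))
```

The fallback to "from" corresponds to the locational reading of iz discussed earlier.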

Let us follow through the analysis of the same prepositions with the
view of a time relation concept. In this case we will find that three
sub-categories become apparent: (1) The first one implies that an action or
state of being occurs after a fixed time span. These prepositions are: ot,
cherez, s, po. (2) The second sub-category connotes that an action or
state of being occurs before a fixed time span. These prepositions are: do,
k, za, pered. (3) The third one implies that the action or state of being
occurs during a fixed time span. These prepositions are: sredi, po, v, s,
za, na, pri. Therefore, we can now draw an analogy with the three
previously examined sub-classes and can call these time sub-categories "after
when," "before when," and "when."

It becomes apparent at this time that some prepositions occurring in 
these new sub-categories have participated in the previous ones. 

Of special interest in analyzing these prepositions are some that 
coincide with the sub-classes which were previously established: 



              PREPOSITIONS   LITERAL    WHERE     WHEN
                             MEANING    CLASS     CLASS

FROM-AFTER    s              with       from      since

TO-BEFORE     k              to         toward    toward

WHERE-WHEN    pri            at         at        during


For large-scale translation it is necessary not only to apply these
larger categories and their sub- and sub-sub-categories, but also to analyze
them in terms of each other in order to establish the similarities, as well
as conflicts, and then establish final definite categories. This task has
been accomplished in the Unified Transfer System, and the precise description
of each category will be included in the projected Unified Transfer System
Manual. In this report, due to the limitations of time and the size of the
report, we shall limit ourselves to the method of arriving at the categories,
rather than categorization itself.

Let us continue the analysis of the three prepositions s, k, and pri, 
limiting ourselves to "where" and "when" categories. 

From initial inspection it would appear necessary to assign at least
ten meaning classes, one for each concept of the three prepositions. It
becomes apparent that the number of meaning classes then would increase
geometrically with the increasing number of preposition-participants and/or
the introduction of new categories. We, therefore, begin to search for means
of reducing the progression. The cues for this reduction come from two
basic sources, and we can estimate that morphology provides about 70% of
the cues by imposing the case restrictions, and the usage of language should
provide the remaining 30%.

In establishing meaning classes, we can combine several concepts into 
a single class, as long as there can be no conflict at the morphological or 
usage levels. The idea is to bring together the three elements of transfer 
(function, form, and meaning) and reduce the number of code patterns to a 
minimum. Thus, the next step would be to establish what part morphology 
plays in reducing the number of meaning classes in the sample prepositions. 

This can be illustrated in the following table.

                                        CASES
PREPOSITION   CLASS   Genitive   Dative   Accusative   Instrumental   Locative

s             1       from                for          with
              2       from
              3       since

k             1                  to
              2                  toward
              3                  toward

pri           1                                                       at
              2                                                       at
              3                                                       during

Note: 1 = literal; 2 = "where" case; 3 = "when" case


From the table above we can see that for the preposition s only one
morphological category is affected; therefore, we need not assign these
particular meaning classes for the accusative and instrumental cases, and
we thus achieve a reduction from the original possible ten to six meaning
classes.

Next we note that the meaning classes for each preposition of the
sample fall into separate cases, and this would permit us to assign only
two meaning classes, a "where" and a "when" class, coded as entries in
the proper case for the respective preposition.

If we now examine these sample prepositions further, we will find
that in the case of the preposition s we apparently would not be able to
reduce the number of meaning classes any further, because
the "where" translation differs from the "when" translation. On the other
hand, k would not present any problem, since both translations are identical,
while pri again presents us with separate translations for the "where" and
"when" cases, but the identical translation of the literal meaning and the
"where" case would permit us to eliminate the "where" class in this instance.
Sometimes, when the translation of prepositions with the same meaning
definition changes several times, we found it practical to give the preposition
a so-called zero translation and attach the translation of the preposition
to the nominals. Thus, we again achieve a reduction in the number of meaning
classes. The same type of classification is applied to the remaining
members of the prepositional phrase (nominals, modifiers), and the classes are
assigned within the boundaries of function-form criteria. The definitions
of meaning have to be used carefully and with discretion, since the criterion
of time element is not necessarily derived from strict time terms, but can
be arrived at in combinations of prepositions with nominals of action.

Example:  pri okislenii    = during the oxidation
          pri rabote       = during the work
          pri issledovanii = during the investigation

When assigning these classes to nouns it becomes apparent that these
concepts are not rigid rules but are the result of subjective judgment. It
is impossible, for instance, to say that all nouns of action in the locative
case will yield the same translation "during" for the preposition pri. There
are vagaries of usage which defy any definition. For instance, if zhelanii,
a noun of action in the locative (desire, wish), occurs with the preposition
pri, the translation of the preposition changes to "if" and the meaning
connotation from time to condition.

Example : pri zhelanii = if desired 
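
The selection just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the set of action nouns, the exception table, and the function name are hypothetical and are not the coded form used in the system; the English glosses are passed in by the caller.

```python
# Hypothetical sketch: data layout and names are assumptions, not UTS codes.

# Nouns of action in the locative, taken from the examples in the text.
ACTION_NOUNS_LOCATIVE = {"okislenii", "rabote", "issledovanii"}

# Usage exceptions that override the general rule (pri zhelanii = if desired).
USAGE_EXCEPTIONS = {("pri", "zhelanii"): "if desired"}


def translate_pri(noun: str, noun_gloss: str) -> str:
    """Choose the translation of 'pri' + noun from the noun's meaning class."""
    exception = USAGE_EXCEPTIONS.get(("pri", noun))
    if exception is not None:
        return exception                      # condition, not time
    if noun in ACTION_NOUNS_LOCATIVE:
        return f"during the {noun_gloss}"     # "when" class
    return f"at the {noun_gloss}"             # literal meaning of pri


print(translate_pri("okislenii", "oxidation"))
print(translate_pri("zhelanii", "desire"))
```

The exception table is consulted first, which mirrors the point made above: usage can override a class-based rule, so the classes cannot be treated as rigid.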

A final consideration in assigning meaning classes must be an ex- 
pression of the physical location of the participating members (words), 
whether they must be immediately adjacent or whether they can be separated 
by non-participating words, i.e., whether the preposition-nominal relation- 
ship is dependent on their immediately adjacent physical location. 


Approved For Release 2004/01/15 : CIA-RDP64-00046R000200030003-3 





Example:

    pri tshchatel'nom issledovanii = during careful investigation
    pri issledovanii               = during investigation

versus:

    na drugoi den6 = next day
    na den6        = for a day

Thus, the indication of the boundary, known in the system as boundary 
indicator or item count, is introduced together with some of the meaning 
classes. If no boundary is necessary, this indicator is coded as zero, 
otherwise it corresponds to the number of participating items, e.g., four 
participant-members (words) require digit 4 as a boundary indicator. 

Thus, after complete analysis, the code pattern for a preposition with:

a) one literal translation

b) P + N = Adverb

would look as follows:

Pattern #1 = 16-0112-200 = 0 translation (adv. class)
               -0111-200 = literal meaning (1:1 equivalent)

This brings together the Function-Meaning-Form transfer categories.

Example:

Preposition bez               Nominals:  interesa
with literal meaning                     somneniia, etc.
without
                                         pol'zy
                                         nuzhdy, etc.

bez interesa   = without interest
bez somneniia  = without doubt
bez pol'zy     = useless
bez nuzhdy     = needless
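
A minimal sketch of the zero-translation idea follows, assuming a hypothetical table in which each nominal records whether it carries the translation of the whole phrase. The dictionary names and the `zero` flag are illustrative assumptions and do not reproduce the numeric code pattern shown above.

```python
# Hypothetical sketch of a zero translation; the layout is an assumption.

PREPOSITION_LITERAL = {"bez": "without"}

# For adverb-class nominals the preposition receives a zero translation and
# the nominal carries the translation of the whole phrase.
NOMINALS = {
    "interesa":  {"english": "interest", "zero": False},
    "somneniia": {"english": "doubt",    "zero": False},
    "pol'zy":    {"english": "useless",  "zero": True},
    "nuzhdy":    {"english": "needless", "zero": True},
}


def translate_phrase(preposition: str, nominal: str) -> str:
    """Translate a preposition + nominal pair."""
    entry = NOMINALS[nominal]
    if entry["zero"]:
        return entry["english"]               # preposition contributes nothing
    return f"{PREPOSITION_LITERAL[preposition]} {entry['english']}"


for noun in NOMINALS:
    print("bez", noun, "=", translate_phrase("bez", noun))
```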
V. LOCATION, ARRANGEMENT OR SYNTAX

The next step in translation, once the function-form-meaning transfer has
been achieved in the code, is the transfer of structure, which we could visualize
as consisting of two sub-transfers:

(1) actual structure transfer;

(2) pure relocation of items within the
structural boundaries.

The structure sub-transfer is the division of a total sentence into
sentences and/or clauses; clauses into blocks; and blocks into phrases.

The first phase of this sub-transfer is the identification of all
punctuation marks within the sentence in regard to their meaning and function
within the sentence.

For example, let us examine the possible meaning and functions of the 
semi-colon (;). 

Its position: 

(1) between independent "sentences," combinations of which 
form the total sentence without use of conjunctions. 

These "sentences" can have commas inside themselves. 

(2) between independent sentences, which are combined into 
a total sentence by means of 

(a) conjunctions no, odnako, vse zhe, tem ne menee, etc.

(b) conjunctions i, da

(3) between phrase-type homogeneous members of the sentence,
specifically where these "phrases" have modifiers or
modifier-groups separated by commas.

(4) between several subordinate clauses with one main clause
present in the sentence; in that case, however, it would
not be followed by conjunctions.

(5) between "sentences" which consist of main clause and
subordinate clauses, i.e., independent "sentences".

(6) between enumeration or recapitulation. 

If we examine the above we can see that, with the exception of (3) and (6),
the semicolon (;) is always a division mark between "sentences" and can be,
upon recognition on the basis of its function code (71), considered as a stop
signal between "sentences" in structural analysis.

For identification of the case in point (6), we have to locate the colon.
Then the situation would be:

xxxxxx : xxx ; xxxx; xxx; xxxxx.        Note: x = word
"sentence"   sentences or phrases

For identification of the case in point (3), the situation is either
similar to point (6) (i.e., we will locate the colon), or, if the colon is absent,
we are dealing with homogeneous phrases which can be treated in the same fashion
as point (1).

Therefore, in the Unified Transfer System the semicolon is accepted as
a stop signal for the division of a sentence into clauses or "sentences."
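
The treatment of the semicolon can be sketched as follows. This is a simplified illustration, not the system's instruction sequence: it works on plain text rather than on function codes, the function name is hypothetical, and the presence of a colon is taken as the only sign of an enumeration.

```python
# Simplified sketch: operates on raw text, whereas the system described here
# recognizes punctuation through its function code (71 for the semicolon).

def divide_on_semicolons(total_sentence: str) -> dict:
    """Divide a total sentence at semicolons, treating a colon as the
    marker of an enumeration (case 6 in the list above)."""
    head, colon, tail = total_sentence.partition(":")
    if colon:
        # "sentence" before the colon, enumerated sentences or phrases after it
        items = [part.strip() for part in tail.split(";") if part.strip()]
        return {"kind": "enumeration", "lead": head.strip(), "items": items}
    # No colon: every semicolon is a stop signal between "sentences"
    parts = [part.strip() for part in total_sentence.split(";") if part.strip()]
    return {"kind": "sentences", "parts": parts}


print(divide_on_semicolons("xxxxxx : xxx ; xxxx; xxx; xxxxx."))
print(divide_on_semicolons("xxx xxxx; xxx xx."))
```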

The same process is applied to other punctuation marks until we divide
the sentence into "sentences," which then in turn have to be divided into
introduction, subject, predicate and final blocks, in the order of their
occurrence in a sentence. Then these blocks, which are found to be present,
are rearranged into a model structure of introduction-subject-predicate-final
blocks.

For the sake of the discussion, let us consider the identification of 
the subject. 

A subject can be: 

(1) Any part of speech in the nominative 

(2) Combination of words (cluster) with the connotation of a

(3) Numerical combinations with a precise or approximate definition
of objects, like: dva priiatelia, neskol6ko chisel,
mnogo liudej, etc.

Therefore, we could say that the subject can be:

     SUBJECT                   EXAMPLES

1.   Noun                      Kolba stoiala na stole
2.   Adjective                 Serye ne podlezhat analizu
3.   Participle                Spavshie prosnulis'
4.   Numeral                   V vode rastvorilis' tol'ko dva
5.   Pronoun                   Ne poshel on domoj
6.   Verb in infinitive        Pisat' trudno
7.   Non-inflected word        Gromkoe "ura" narushilo tishinu.
8.   Word cluster              Brat s sestroj uchatsia v universitete
9.   Numerical combination     Dvesti studentov izuchaiut inostrannye iazyki.

By examining the "sentence" for the presence of a noun in the nominative, of any
other part of speech in the nominative (which can then be either the subject itself
or a modifier to a subject in another case, i.e., a cluster subject), or of words like
mnogo, malo, etc., and by identifying these on the basis of their function-form codes,
we can identify the subject and its modifiers and rearrange the total
into a block sequence.


Example:

Original block order       Predicate (2)   Subject (1)     Final (3)
(inverted word order)      Ne poshel       on              domoj

Rearranged block order     Subject (1)     Predicate (2)   Final (3)
                           on              ne poshel       domoj
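
The rearrangement of blocks into the model structure amounts to a stable sort on block type. A minimal sketch, assuming each block has already been recognized and labeled (the labels and the function name are illustrative):

```python
# Sketch under the assumption that block recognition has already been done.

MODEL_ORDER = ["introduction", "subject", "predicate", "final"]


def rearrange_blocks(blocks: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Reorder (label, text) blocks into introduction-subject-predicate-final.
    Blocks that are absent are simply skipped; sorted() is stable."""
    return sorted(blocks, key=lambda block: MODEL_ORDER.index(block[0]))


original = [("predicate", "ne poshel"), ("subject", "on"), ("final", "domoj")]
print(rearrange_blocks(original))
```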
Whenever the blocks are larger than one word (which happens in most of the
cases), we check the rearrangement numeric codes, which are attached to the
translation. These numbers are assigned on the basis of the target language
translations as well as the function of its source equivalent.

Example: Dlia naibolee nagliadnogo predstavleniia = to get a clearer
concept.

The source language preposition dlia received a verbal translation, but
retained a prepositional rearrangement code because of source language word
function.

A list of rearrangement codes and their definitions or equivalents follows:

Syntax codes:

001  Preposition
002  Introductory words (if, that, which, what, how, why, as, since, etc.)
003  Conjunctions (and, but, either, or, neither, nor, comma, colon,
     semicolon, etc.)
004  Words like only, just, then, perhaps, maybe, therefore, however,
     almost, likewise, etc.
005  Not
006  It is possible, it is not possible, it is known, etc.
007  Some, all, any, none, something, anything, any kind, nothing, etc.
011  My, yours, his, etc. (possessive form)
012  Numerals (one, two, etc., first, second, etc., few, many, much, more,
     most, last, etc.)
013  Other
014  Adjectives (including some pronouns)
015  Nouns and pronouns (nominals)
016  Myself, yourself, itself, etc.
017  Participial modifiers
021  Will, may, must, can, do, etc.
022  Have
023  To be
024  Seldom, often, really, verbally, continuous, ever, never (adverbs of time)
025  Verbs and short forms of participles, adjectives
026  Here, there, away, beyond, upstairs (place)
027  Equally, rapidly, strangely, unequally, vastly, greatly, considerably,
     quite, etc. (manner)
031  Early, late, later, soon, etc. (time)
032  Period

On the basis of the above table, the constituent parts of the system 
are as follows: 


1. Source language dictionary (in our case Russian Dictionary).
Its format is: Russian word = Dict. Line # + Code Pattern #.

Example:

PREDSTAVLENIIA = 14155* 66-212

* arbitrary D. L. #

2. Target language dictionary, in our case English dictionary.
Its format is: Dictionary Line # - English equivalent + rearrangement code.

Example:

14155   A   concept               015
        B   concept               015
        C   of the performance    015
        D   the performance       015
        E   performance           015
        F   of the concept        015
        G   the concept           015
        H   the performances      015
        J   performances          015
        K   the performances      015
        L   performances          015
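
The two-dictionary lookup can be pictured with the sketch below. The Python structures and the function name are assumptions made for illustration; how the system selects the lettered line (A, B, C, ...) is not shown here, so the letter is simply passed in by the caller, and only three of the lettered lines are reproduced.

```python
# Illustrative sketch; the in-memory layout is an assumption, and the
# spelling PREDSTAVLENIIA is assumed for the sample entry.

# Source dictionary: Russian word -> (dictionary line #, code pattern #)
SOURCE_DICTIONARY = {"PREDSTAVLENIIA": (14155, "66-212")}

# Target dictionary: line # -> lettered line -> (English, rearrangement code)
TARGET_DICTIONARY = {
    14155: {
        "A": ("concept", "015"),
        "F": ("of the concept", "015"),
        "G": ("the concept", "015"),
    }
}


def look_up(russian_word: str, line_letter: str) -> dict:
    """Follow a Russian word to one lettered line of its English entry."""
    line_number, code_pattern = SOURCE_DICTIONARY[russian_word]
    english, rearrangement = TARGET_DICTIONARY[line_number][line_letter]
    return {
        "code_pattern": code_pattern,
        "english": english,
        "rearrangement_code": rearrangement,
    }


print(look_up("PREDSTAVLENIIA", "F"))
```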
3. Code patterns arranged by code pattern #.

Example:

66-212    4115   230   000
          3116   230   000
          0111   230   421
          0111   230   401
          0111   230   400
          0111   230   021
          0111   230   001
          0111   230   000
          0111   131   401
          0111   130   400
          0111   431   401
          0111   431   400


The code distribution in pattern:

66-212    66 is functional part of speech
          212 actual # of the pattern

Each line of the pattern is a 10-digit code, the unified transfer code:

Field     1          2          3          4          5          6
Width     1 digit    3 digits   3 digits   1 digit    1 digit    1 digit

1. Semantic (meaning) boundary indicator

2. Semantic class

3. Form or morphology group

4. Subject matter indicator (microglossary control)

5. Preposition control

6. Article control
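
As a worked illustration of the field layout (1, 3, 3, 1, 1, and 1 digits), the following sketch splits one 10-digit line into its six fields. The class and function names are hypothetical; only the field widths and field meanings come from the list above.

```python
# Sketch: field names are illustrative; widths follow the layout above.
from typing import NamedTuple


class TransferCode(NamedTuple):
    boundary_indicator: int      # 1 digit
    semantic_class: str          # 3 digits
    form_group: str              # 3 digits
    subject_matter: int          # 1 digit
    preposition_control: int     # 1 digit
    article_control: int         # 1 digit


def parse_transfer_code(line: str) -> TransferCode:
    """Split a 10-digit unified transfer code, ignoring spaces or hyphens."""
    digits = "".join(ch for ch in line if ch.isdigit())
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {len(digits)}")
    return TransferCode(
        int(digits[0]), digits[1:4], digits[4:7],
        int(digits[7]), int(digits[8]), int(digits[9]),
    )


print(parse_transfer_code("0111 230 421"))
print(parse_transfer_code("4115 230 000"))
```

Read this way, the line 4115 230 000 carries a boundary indicator of 4, i.e., four participating items.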

The system instructions outside of strict data preparation (text
preparation, dictionary look-up, sorts, etc.) are divided as follows:

1. Progression instructions (determination of working area)

2. Selection:

a) comparison of codes in patterns for the selection of form and meaning

b) article, preposition selection

3. Verification or correction instructions for:

a) changes in progression (larger or smaller string, single word
selection, etc.)

b) pronoun, conjunction selection

4. Sentence recognition and division instructions

5. Syntactic block recognition

6. Rearrangement instructions
UTS

A       A
B       B
V       V
G       G
D       D
E       E
ZH      ZH
Z       Z
I       I
I       J
K       K
L       L
M       M
N       N
O       O
P       P
R       R
S       S
T       T
U       U
F       F
KH      KH
TS      TS
CH      CH
SH      SH
SHCH    SC
"       1
Y       Y
'       6
E       EH
IU      IU
IA      IA
SECTION TWO

REPORT ON THE AUTOMATIC DECLENSION OF RUSSIAN NOUNS 
FOR THE UNIFIED TRANSFER SYSTEM (UTS) 


By 

Rudolf Loewenthal

Key to Abbreviations

a or anim.     animate
acc.           accusative
b              both (animate or inanimate)
C              class
dat.           dative
F or fem.      feminine
gen.           genitive
i or inan.     inanimate
inst.          instrumental
loc.           locative
M or masc.     masculine
N or neut.     neuter
P or plur.     plural
S or sing.     singular

Numbers in declension patterns:

1 - nominative
2 - genitive
3 - dative
4 - accusative
5 - instrumental
6 - locative

I. INTRODUCTION

Our transliteration from the Russian follows basically that of the Library
of Congress. We omitted all diacritical marks because they cannot be
reproduced on the computer. In addition, we made a few alterations; for
reasons of economy we reduced the four-letter combination to two English
letters corresponding to a single Russian letter. Thus, for practical
considerations imposed upon us by the use of a computer, we introduced the
following minor changes in the transliteration of the Library of Congress:

Russian     Library of     UTS trans-
            Congress       literation

й           ĭ              j
щ           shch           sc
ъ           "              1
ь           '              6
э           ė              eh


A key problem in the development of the UTS or of any other system is
the preparation of a dictionary. The task is twofold:

(1) The compilation of all the paradigmatic forms of nouns,
adjectives, pronouns, verbs, and all verbal forms like
participles, gerunds, etc., in the source language; and

(2) the meanings in the target language.

We are here concerned with the first; that is, the tedious, time-consuming,
and costly mechanical task of compiling a paradigmatic dictionary manually.
We realized the magnitude of that task, which would have kept us occupied
for several months, even if a competent linguistic staff had been provided.
The cost would have been high and our energy would have been drained away
in organizing and revising such work.

It has been of great help to us that statistical research had been done
on some of our problems. Harry H. Josselson of Wayne University made
a superb study on "The Russian Word Count,"* in which he analyzed almost
47,000 Russian words of representative sources for their usage. He
came to the conclusion that only 4.4% of the words are non-inflected,
i.e., adverbs, prepositions, conjunctions, and particles. 95.6% of
all the words are inflected; that is, they are either declined or
conjugated.


The findings are as follows:

Group 1:   55.1%   noun, adjective, adjective used as noun, pronoun,
                   and numerals characterized by declension
                   (declensions: 69%)

Group 2:   26.6%   verb

Group 3:   13.9%   verb derived forms of participle, participle used
                   as adjective, participle used as noun, and gerund

Total      95.6%

After we have established the fact that 95.6% of Russian words in average
texts are inflected, let us examine the details of the problem. The
Russian language has a total of 137 suffixes for declensions and
conjugations, including the suffix 0 which gives the machine negative
information.

* The Russian Word Count and Frequency Analysis of Grammatical Categories
of Standard Literary Russian. Detroit, Wayne Univ. Press, 1953, p. 12 & 17.

On the average, each of these suffixes occurs close to 2,900 times in a
paradigmatic dictionary, based on a 50,000 word dictionary. This should
once and for all dispel the concept of solving Machine Translation with the
help of a split dictionary. The ambiguity in the stem-suffix relationship
would simply be staggering. As an example in which it is impossible
to identify different parts of speech or grammatical categories, let us
examine the stem smotr- (inspection (noun); to look (verb)):

noun failure:    -u   noun stem      (dative; to inspection)
                 -u   verb stem      (non-existent)

verb failure:    -iu  noun suffix    (non-existent)
                 -iu  verb suffix    (I am looking)

We calculated that approximately one in six stems (one sixth), or
almost 17%, is subject to such ambiguity on the basis of the 60,000-word
dictionary by Miller.

The magnitude of the process of compiling a paradigmatic dictionary
can be gauged with fair accuracy. Noun or adjective stems usually have
ten paradigms, while verbs have 37 paradigms; Russian stems average 25
paradigms, including 0. Thus, for a dictionary of 50,000 terms, it would
be necessary to compile a paradigmatic dictionary of some 1,250,000
paradigmatic forms. We have found a solution to this problem by
utilizing an electronic computer.

Traditionally, there are three declensions and two (or possibly three)
conjugations. By substantially increasing the number of categories we
were able to eliminate 95% of the irregularities, thus leaving us with
less than 5% of the words (approx. 2,500) to be declined or conjugated
manually. For use on the computer, we have assigned each of our declension
or conjugation categories a code number which we call a "declension code
pattern." In accordance with the code pattern numbers, the computer will
select appropriate suffixes for every stem entered and compile a
complete set of paradigms for every word desired.* We can increase our
dictionary by a minimum of 5,000 words or 135,000 paradigmatic forms per
month and provide them with all the codes necessary for translation.

* We are using stems with a maximum of invariable letters.

II. DECLENSION OF NOUNS

A. General

For machine purposes, we have established 11 declension groups with 19
sub-groups, or altogether 30 classes of Russian nouns. Only two of the neuter
groups are quite regular and have no sub-groups (see Chart 1 at the end of this
section, nos. 3 and 11). The frequent masculine (nos. 1 and 2) and neuter (nos. 3
and 4) declensions have been placed next to each other at the beginning of the
chart because of their many similarities. The minor groups (masc.: nos. 8, 9,
and 10; neut.: no. 11) appear at the end of the list. In the feminine gender
(nos. 5, 6, and 7; incl. two small masc. sub-groups: nos. 5c and 6d), the
distribution is more even. The main declension patterns (nos. 1-7) with their
sub-groups apply to between 80% and 85% of all the nouns, while the minor patterns
(nos. 8-11) provide for about 10% of the remaining nouns. Approximately 5% of
the nouns are quite irregular and will have to be prepared by hand.

There are only two variant forms for the dative, instrumental, and locative
plural of all nouns. In contrast, the endings of the other cases have numerous
variations.


B. Undeclinable Nouns*

Certain types of nouns cannot be declined at all. They include:

(1) Foreign names or loan words ending in -i and -u/-iu (alibi, Peru,
parveniu).

(2) European loan words ending in -o and -e/-eh (kakao, kofe, aloeh).

(3) Ukrainian surnames ending in -ko.

(4) Some foreign loan words or names ending in a stressed -a (amplua,
Diuma).

(5) Feminine nouns (applying to women) ending in a consonant: madam, miss,
missis, frejlejn, mademuazel6; likewise alma mater and bersez.

(6) Surnames ending in a consonant, even if they are of Russian or
Ukrainian origin (Volk, Gogol6).

(7) Russian surnames ending in -ykh/-ikh, -ogo, and -ago.

(8) Musical notes (do, re, mi, fa, so, la, si) and the letters of the
alphabet.

* Undeclinable nouns can be traced in any standard Russian grammar. The above
information follows B. O. Unbegaun in his Grammaire russe (Paris, Les Langues
du Monde, 1953, p. 70-71).

C. Declension of Masculine Nouns

There exist altogether 17 automatic declension patterns for the masculine
gender; i.e., two main groups and three minor groups; the remainder are sub-groups
(see Table 1).

TABLE 1: SAMPLES OF 17 MASCULINE NOUN DECLENSION PATTERNS

Number   Number on   Number of     Sample Word                 Animate or
         Chart       Declension                                Inanimate
                     Pattern

1        1           1a            doklad                      i
2        2           1b            ocherk                      i
3        3           2a            saraj                       i
4        4           2b            delitel6                    i
5        5           2c            put6                        i
6        6           2d            ehkipazh                    i
7        7           2e            geroj                       a
8        22          8a            uchenik                     a
9        23          8b            lev (2 stems)               a
10       24          9a            kon6                        a
11       25          9b            uchitel6                    a
12       26          10a           gorod                       i
13       27          10b           nozh                        i
14       28          10c           tovarisc                    a
15       29          10d           doktor                      a
16       15          5c            muzhchina (fem. pattern)    a
17       19          6d                                        a

Masculine nouns display the greatest variety of declension patterns;
they account for 17, or more than one-half of the total.

M-1 This very large class of words pertains to inanimate objects.

Table 2 - Class 1

          S                   P
C     1a(i)   1b(i)       1a(i)   1b(i)

1     0       0           y       i
2     a       a           ov      ov
3     u       u           am      am
4     0       0           y       i
5     om      om          ami     ami
6     e       e           akh     akh

Samples:  1a: vint, narod, doklad, pantograf, son (2 stems)
          1b: ocherk, potok, ugolek (2 stems), veter (2 stems)

Sub-group 1b differs from 1a only in the ending -i
(instead of -y) in the nom. and acc. plur.*

* There are many masc. nouns ending in a compound consonant. Their
gen. plur. usually ends in -ej (10b-c).
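
To make the idea of automatic declension concrete, the sketch below attaches the Class 1 endings to a stem, with the zero suffix written as an empty string. It is an illustration only: the data layout and function name are assumptions, the endings are those read from Table 2 (gen. sing. -a for both sub-groups), and two-stem nouns such as son or ugolek are not handled.

```python
# Illustrative sketch of suffix selection by declension code pattern.
# Layout and names are assumptions; two-stem nouns are out of scope.

CASES = ["nominative", "genitive", "dative",
         "accusative", "instrumental", "locative"]   # cases 1-6

ENDINGS = {
    "1a": {"S": ["", "a", "u", "", "om", "e"],
           "P": ["y", "ov", "am", "y", "ami", "akh"]},
    "1b": {"S": ["", "a", "u", "", "om", "e"],
           "P": ["i", "ov", "am", "i", "ami", "akh"]},
}


def decline(stem: str, pattern: str) -> dict:
    """Return all twelve paradigmatic forms for a one-stem noun."""
    table = ENDINGS[pattern]
    return {
        (number, case): stem + table[number][index]
        for number in ("S", "P")
        for index, case in enumerate(CASES)
    }


forms = decline("doklad", "1a")
for key, form in forms.items():
    print(key, form)
```

Adding a further class under this scheme means adding one more entry to the endings table, which is the sense in which a larger number of categories absorbs most irregularities.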


M-2 This major group has the following endings:

Table 3 - Class 2

                     S                                          P
C     2a(i)   2b(i)   2c(i)   2d(i)   2e(a)      2a(i)   2b(i)   2c(i)   2d(i)   2e(a)

1     j       6       6       0       j          i       i       i       i       i
2     ia      ia      i       a       ia         ev      ej      ej      ej      ej
3     iu      iu      i       iu      iu         iam     iam     iam     am      iam
4     j       6       6       0       ia         i       i       i       i       i
5     em      em      em      em      em         iami    iami    iami    ami     iami
6     e       e       i       e       e          iakh    iakh    iakh    akh     iakh

Samples:  2a: sluchaj, saraj, kraj, boj, aluminij (sing. only),
              chaj (sing. only)
          2b: delitel6, koren6, den6 (2 stems), kamen6, slovar6
          2c (rare): put6
          2d (extremely rare): ehkipazh
          2e: geroj, vorobej, muravej, svidetel6

Sub-group 2e is the only class of animate nouns in this major declension
group and corresponds directly to sub-group 2a.

M-8 This minor class includes only animate nouns:

Table 4 - Class 8

          S                   P
C     8a(a)   8b(a)       8a(a)   8b(a)

1     0       0           i       y
2     a       a           ov      ov
3     u       u           am      am
4     a       a           ov      ov
5     om      om          ami     ami
6     e       e           akh     akh

Samples:  8a: letchik, starik, byk, brat, voron, izvozchik, volk
          8b: lev (2 stems), ugol (2 stems), vol

Sub-group 8b differs from 8a only in the nom. plur., where the two-stem
nouns carry the ending -y (instead of -i). Masc. nouns of class 8a ending
in the letters -k, -g, -kh, -ch, -sc, -zh, -sh carry the ending -i in the
nom. plur. The inanimate nouns of classes 10b (very close to 1b) and 10c
have the -i ending also in the acc. plur.


M-9 The endings of this minor group of animate nouns are as follows:

Table 5 - Class 9

          S                   P
C     9a(a)   9b(a)       9a(a)   9b(a)

1     6       6           i       ia
2     ia      ia          ej      ej
3     iu      iu          iam     iam
4     ia      ia          ej      ej
5     em      em          iami    iami
6     e       e           iakh    iakh

Samples:  9a: kon6, golub6, spasitel6, pisatel6, zhitel6, gost6
          9b: uchitel6, rukovoditel6

The very small 9b class has the ending -ia in the nom. plur. (instead
of -i; cf. also classes 10a and 10d).
M-10 This is one of the most ambiguous classes, containing animate as
well as inanimate nouns.

Table 6 - Class 10

                    S                                    P
C     10a(i)   10b(i)   10c(a)   10d(a)      10a(i)   10b(i)   10c(a)   10d(a)

1     0        0        0        0           a        i        i        a
2     a        a        a        a           ov       ej       ej       ov
3     u        u        u        u           am       am       am       am
4     0        0        a        a           a        i        ej       ov
5     om       om       em       om          ami      ami      ami      ami
6     e        e        e        e           akh      akh      akh      akh

Samples:  10a: gorod, ostrov, parus, vecher
          10b: nozh
          10c: tovarisc
          10d: doktor

Differences between the various groups occur in the acc. and inst. sing.,
as well as in the nom., gen., and acc. plur. Note the unusual plur. ending
-a in the nom. (10a and 10d) and acc. (10a) plur.
M-5c, M-6d Two small sub-groups of masc. nouns are declined like fem. nouns
and have been classified among them. Their endings are:

Table 7 - Classes 5c and 6d

          S                   P
C     5c(a)   6d(a)       5c(a)   6d(a)

1     a       ia          i       i
2     i       i           ej      ej
3     e       e           am      iam
4     u       iu          ej      ej
5     ej      ej          ami     iami
6     e       e           akh     iakh

Samples:  5c: muzhchina, dedushka, mal6chishka
          6d: diadia, sud6ia
D. Declension of Neuter Nouns

For automatic declension, we consider all neuter nouns as inanimate.
The word ditia, though animate, is so irregular that it requires manual
declension. As a matter of fact, most of the irregular nouns, which have to
be declined by hand, belong to the neuter nouns.* Other rare exceptions of
neuter animate nouns we integrate into one of the masculine classes for our
purposes.

The six neuter declension patterns (classes 3, 4, and 11) are much
more regular than the masculine ones. Actually, classes 3 and 11 have no
sub-groups at all.


Table 8: SAMPLES OF SIX NEUTER NOUN DECLENSION PATTERNS

Number   Number on   Number of     Sample Word                     Animate or
         Chart       Declension                                    Inanimate
                     Pattern

1        8           3             mesto (no sub-group)            i
2        9           4a            pole                            i
3        10          4b            dejstvie                        i
4        11          4c            lozhe                           i
5        12          4d            plat6e                          i
6        30          11            ukho (2 stems, no sub-group)    i

* There are some twenty neuter nouns which are so irregular that they will
have to be declined manually; e.g., vremia, vymia, ditia, imia, plamia,
plemia, semia, stremia, temia, znamia.
N-3 Class 3 is one of the two classes (the other is class 11) which have no sub-groups.

Table 9: CLASS 3

      S         P
C     3(i)      3(i)

1     o         a
2     a         0
3     u         am
4     o         a
5     om        ami
6     e         akh

Samples: mesto, slovo, delo, ozero, selo
N-4 Class 4 includes inanimate neuter nouns with four slightly different
declension patterns.

Table 10: CLASS 4

                 S                                P
C     4a(i)   4b(i)   4c(i)   4d(i)      4a(i)   4b(i)   4c(i)   4d(i)

1     e       e       e       e          ia      ia      a       ia
2     ia      ia      a       ia         ej      j       0       ev
3     iu      iu      u       iu         iam     iam     am      iam
4     e       e       e       e          ia      ia      a       ia
5     em      em      em      em         iami    iami    ami     iami
6     e       i       e       e          iakh    iakh    akh     iakh

Samples:  4a: pole, more
          4b: dejstvie
          4c: lozhe, zhilisce, uchilisce, serdtse
          4d: plat6e

N-11 Class 11 is one of the two classes (the other is class 3) which have no sub-groups.

Table 11: CLASS 11

      S         P
C     11(i)     11(i)

1     o         i
2     a         ej
3     u         am
4     o         i
5     om        ami
6     e         akh
E. Declension of Feminine Nouns

There are three main classes or seven groups of feminine declension
patterns for nouns. Like the neuter nouns, the feminine nouns are much
more regular than those of the masculine gender.* The masculine sub-groups
5c and 6d have a feminine declension type, but they have been discussed at
the end of the section on masculine declensions.

Table 12: SAMPLES OF SEVEN FEMININE NOUN DECLENSION PATTERNS

Number   Number on   Number of     Sample Word     Animate or
         Chart       Declension                    Inanimate
                     Pattern

1        13          5a            bukva           b
2        14          5b            kniga           i
3        16          6a            dolia           b
4        17          6b            funktsiia       i
5        18          6c            niania          a
6        20          7a            chast6          i
7        21          7b            loshad6         a


* Some feminine nouns are irregular, but because of their small number it is
not worthwhile to establish a special sub-group for them. They will have to
be declined manually; e.g., zmeia, gost6ia, and other animate nouns;
derevnia, bashnia, kolokol6nia, bojnia, whose genitive plural ends in -en;
and kukhnia, which ends in -on.
The patterns of classes 5a and 6a apply to both animate and inanimate
nouns.

F-5 Class 5a includes both animate and inanimate nouns; 5b applies to
inanimate nouns only, while 5c is a masculine sub-group of animate nouns
with a feminine declension pattern (cf. 6d; both are described among the
masculine nouns).

Table 13: CLASS 5

                  S                               P
      fem.     fem.     masc.          fem.     fem.     masc.
C     5a(b)    5b(i)    5c(a)          5a(b)    5b(i)    5c(a)

1     a        a        a              y        i        i
2     y        i        i              0        0        ej
3     e        e        e              am       am       am
4     u        u        iu             y        i        ej
5     oj       oj/oiu   ej             ami      ami      ami
6     e        e        e              akh      akh      akh

Samples:  5a: bukva, zhena, trava
          5b: kniga, tysiacha, devushka, ruchka (2 stems),
              palka (2 stems)
          5c: muzhchina, dedushka, mal6chishka
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F-6 Class 6a includes both animate and inanimate nouns; 6b contains only
inanimate and 6c only animate nouns, while 6d is a masculine sub-group with a
feminine declension pattern (cf. 5c; both are described among the masculine
nouns).


Table 14: CLASS 6

                      S                                    P
  c    fem.    fem.    fem.    masc.      fem.    fem.    fem.    masc.
       6a(b)   6b(i)   6c(a)   6d(a)      6a(b)   6b(i)   6c(a)   6d(a)

  1    ia      ia      ia      ia         i       i       i       i
  2    i       i       i       i          ej      j       ej      ej
  3    e       i       e       e          iam     iam     iam     iam
  4    iu      iu      iu      iu         i       i       ej      ej
  5    ej      eiu     ej      ej         iami    iami    iami    iami
  6    e       i       e       e          iakh    iakh    iakh    iakh

Samples, 6a: barynia, svinia, stat6ia (2 stems)
         6b: ideia, istoriia, funktsiia, armiia
         6c: niania (2 stems), pulia (2 stems)
         6d: sud6ia, diadia
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F-7 This class contains one group for inanimate and another for animate
nouns.

Table 15: CLASS 7

              S                    P
  c    7a(i)    7b(a)       7a(i)    7b(a)

  1    6        6           i        i
  2    i        i           ej       ej
  3    i        i           iam      iam
  4    6        6           i        ej
  5    6iu      6iu         iami     iami
  6    i        i           iakh     iakh

Samples, 7a: chast6, gavan6, step6, artel6
         7b: loshad6
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F. Russian Suffixes in English Transliteration 

The morphology of Russian nouns is limited to 26 suffixes. These
are arranged according to the number of letters in their English
transliteration and alphabetically within those groups. They are preceded by
the symbols 0 and 6.

Table 16: NUMBER OF NOUN SUFFIXES

  Group     Number of          Number of
  Number    English Letters    Suffixes

  1         1                  9
  2         2                  9
  3         3                  6
  4         4                  2

                      Total    26


The suffixes have been given numeric octal notations in our working 
chart ranging from 001 to 202 (table 17, col. 1 and table 18, col. 2). 

The noun stems have been divided into 30 classes ranging from 100 to 130*. 
In accordance with these code pattern numbers, the computer will select 
the appropriate suffixes for every stem entered and compile a complete set 
of paradigmatic forms for every word desired. 


* Code pattern no. 100 pertains to undeclinable nouns. The code pattern 
numbers 100 to 130 for the noun stems will have to be inserted manually. 
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Suffix   Suffix      Noun Classes
Code

001      0 (zero)    100, 101, 102, 106, 108, 111, 113, 114, 122, 123, 126,
                     127, 128, 129

002      6           104, 105, 118, 120, 121, 124, 125

003      a           101, 102, 106, 108, 111, 113, 114, 115, 122, 123,
                     126, 127, 128, 129, 130

004      e           101, 102, 103, 104, 106, 107, 108, 109, 110, 111,
                     112, 113, 114, 115, 116, 118, 119, 122, 123, 124, 125,
                     126, 127, 128, 129, 130

005      i           102, 103, 104, 105, 106, 107, 110, 114, 115, 116, 117,
                     118, 119, 120, 121, 122, 124, 127, 128, 130

006      j           103, 107, 110, 117

011      o           108, 130

012      u           101, 102, 106, 108, 111, 113, 114, 115, 122, 123,
                     126, 127, 128, 129, 130

014      y           101, 113, 123

022      am          101, 102, 106, 108, 111, 113, 114, 115, 122, 123,
                     126, 127, 128, 129, 130

027      ej          104, 105, 106, 109, 115, 116, 118, 119, 120, 121, 124,
                     125, 127, 128, 130

030      em          103, 104, 105, 106, 107, 109, 110, 111, 112, 124, 125,
                     128

033      ev          103, 107, 112

034      ia          103, 104, 107, 109, 110, 112, 116, 117, 118, 119, 124,
                     125

043      iu          103, 104, 107, 109, 110, 112, 116, 117, 118, 119, 124,
                     125

054      oj          113

055      om          101, 102, 108, 122, 123, 126, 127, 129, 130

056      ov          101, 102, 122, 123, 126, 129
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Suffix   Suffix      Noun Classes
Code

101      6iu         120, 121

103      akh         101, 102, 106, 108, 111, 113, 114, 115, 122, 123,
                     126, 127, 128, 129, 130

104      ami         101, 102, 106, 108, 111, 113, 114, 115, 122, 123,
                     126, 127, 128, 129, 130

112      eiu         117

123      iam         103, 104, 105, 107, 109, 110, 112, 116, 117, 118, 119, 120,
                     121, 124, 125

140      oiu         114

201      iakh        103, 104, 105, 107, 109, 110, 112, 116, 117, 118, 119, 120,
                     121, 124, 125

202      iami        103, 104, 105, 107, 109, 110, 112, 116, 117, 118, 119, 120,
                     121, 124, 125
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Class    Suffix Code Numbers for Noun Stems

124      002, 004, 005, 027, 030, 034, 043, 123, 201, 202
125      002, 004, 027, 030, 034, 043, 123, 201, 202
126      001, 003, 004, 012, 022, 055, 056, 103, 104
127      001, 003, 004, 005, 012, 022, 027, 055, 103, 104
128      001, 003, 004, 005, 012, 022, 027, 030, 103, 104
129      001, 003, 004, 012, 022, 055, 056, 103, 104
130      003, 004, 005, 011, 012, 022, 027, 055, 103, 104
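The listing of suffix codes per noun class is the inverse of the listing of noun classes per suffix, so one can be derived mechanically from the other. The following is a minimal sketch of that inversion, not the report's own program. It assumes only part of the suffix list has been keyed in (the rows that mention classes 126, 127, 129 and 130); the names `SUFFIX_ROWS` and `codes_by_class` are hypothetical.

```python
from collections import defaultdict

# Partial copy of the suffix list: octal suffix code -> (suffix, noun classes).
# Assumption: only the rows needed for classes 126, 127, 129 and 130 are
# entered, so the result is complete for those four classes only.
_BROAD = "101 102 106 108 111 113 114 115 122 123 126 127 128 129 130"
SUFFIX_ROWS = {
    "001": ("", "100 101 102 106 108 111 113 114 122 123 126 127 128 129"),
    "003": ("a", _BROAD),
    "004": ("e", "101 102 103 104 106 107 108 109 110 111 112 113 114 115 116 "
                 "118 119 122 123 124 125 126 127 128 129 130"),
    "005": ("i", "102 103 104 105 106 107 110 114 115 116 117 118 119 120 121 "
                 "122 124 127 128 130"),
    "011": ("o", "108 130"),
    "012": ("u", _BROAD),
    "022": ("am", _BROAD),
    "027": ("ej", "104 105 106 109 115 116 118 119 120 121 124 125 127 128 130"),
    "055": ("om", "101 102 108 122 123 126 127 129 130"),
    "056": ("ov", "101 102 122 123 126 129"),
    "103": ("akh", _BROAD),
    "104": ("ami", _BROAD),
}


def codes_by_class(rows):
    """Invert code -> classes into class -> ascending list of suffix codes."""
    table = defaultdict(list)
    for code in sorted(rows):          # three-digit strings sort numerically
        _suffix, classes = rows[code]
        for noun_class in classes.split():
            table[noun_class].append(code)
    return dict(table)


CLASS_TABLE = codes_by_class(SUFFIX_ROWS)
print(CLASS_TABLE["130"])
```

For class 130 the derived list agrees with the row for class 130 in the class listing, which is a convenient cross-check when either table is keyed in by hand.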
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III. SAMPLE DECLENSION

Based on the declension pattern which will be inserted manually, the 
computer will select the appropriate suffix codes and compile automatically 
a complete set of paradigmatic forms for every word desired. For instance, 
the stems of the two fem. nouns: 

knig- and tysiach-

Table 19. Declension of fem. Nouns of Class 5b

  Case             Suffix Code    Suffix

  P-2 and stem     001            0
  S-1              003            a
  S-3, 6           004            e
  S-2, P-1, 4      012            i
  S-4              014            u
  P-3              022            am
  S-5*             054            oj
  P-6              103            akh
  P-5              104            ami
  S-5*             112            oiu

* The machine will identify both forms of the inst. sing.,
the modern form knig-oj and the obsolescent form knig-oiu.
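The compilation of paradigmatic forms from a stem and a class pattern can be sketched in a few lines. This is an illustration under stated assumptions rather than the system's actual program: the pattern is keyed in from Table 19 with the code numbers as printed there, the zero suffix is represented by an empty string, and the names `PATTERN_5B` and `decline` are hypothetical.

```python
# Class 5b pattern, keyed in from Table 19: (cases, suffix code, suffix).
# "S" = singular, "P" = plural; the zero suffix (code 001) is the empty string.
PATTERN_5B = [
    (("P-2",), "001", ""),
    (("S-1",), "003", "a"),
    (("S-3", "S-6"), "004", "e"),
    (("S-2", "P-1", "P-4"), "012", "i"),
    (("S-4",), "014", "u"),
    (("P-3",), "022", "am"),
    (("S-5",), "054", "oj"),
    (("P-6",), "103", "akh"),
    (("P-5",), "104", "ami"),
    (("S-5",), "112", "oiu"),   # second instrumental singular form
]


def decline(stem, pattern):
    """Return {case: [forms]}; a case may carry more than one form."""
    forms = {}
    for cases, _code, suffix in pattern:
        for case in cases:
            forms.setdefault(case, []).append(stem + suffix)
    return forms


paradigm = decline("knig", PATTERN_5B)
for case in sorted(paradigm):
    print(case, ", ".join(paradigm[case]))
```

Each case maps to a list rather than a single string so that both instrumental singular forms mentioned in the footnote are kept.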
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Table 20: SAMPLE DECLENSION OF A TWO-STEM NOUN

  Sing.   Sing.   Suffix   Suffix     Plur.   Plur.   Suffix   Suffix
  Case    Stem    Code              Case    Stem    Code
                  Number                            Number

  2       ukh-    003      a          1, 4    ush-    005      i
  6       ukh-    004      e          3       ush-    022      am
  1, 4    ukh-    011      o          2       ush-    027      ej
  3       ukh-    012      u          6       ush-    103      akh
  5       ukh-    055      om         5       ush-    104      ami


In this particular case, the stems are equally divided between the singular
and the plural. It is, however, quite irrelevant to the system whether the two
stems are used in the singular, in the plural, or in both.
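A two-stem declension differs from the one-stem case only in that each case slot names which stem to use in addition to the suffix code. The sketch below illustrates this with the stems ukh- and ush- from Table 20. It is an assumption-laden illustration: the suffix letters are taken from the suffix list (where 012 is u), and the names `SUFFIX_BY_CODE`, `TWO_STEM_PATTERN` and `decline_two_stem` are hypothetical.

```python
# Suffix code -> suffix, for the ten codes used by this pattern.
SUFFIX_BY_CODE = {
    "003": "a", "004": "e", "005": "i", "011": "o", "012": "u",
    "022": "am", "027": "ej", "055": "om", "103": "akh", "104": "ami",
}

# Case slot -> (stem index, suffix code). Stem 0 and stem 1 may be mixed
# freely; here stem 0 happens to serve the singular and stem 1 the plural.
TWO_STEM_PATTERN = {
    "S-1": (0, "011"), "S-2": (0, "003"), "S-3": (0, "012"),
    "S-4": (0, "011"), "S-5": (0, "055"), "S-6": (0, "004"),
    "P-1": (1, "005"), "P-2": (1, "027"), "P-3": (1, "022"),
    "P-4": (1, "005"), "P-5": (1, "104"), "P-6": (1, "103"),
}


def decline_two_stem(stems, pattern):
    """Build every case form from the stem and suffix code named per slot."""
    return {
        slot: stems[index] + SUFFIX_BY_CODE[code]
        for slot, (index, code) in pattern.items()
    }


forms = decline_two_stem(("ukh", "ush"), TWO_STEM_PATTERN)
for slot in sorted(forms):
    print(slot, forms[slot])
```

Because the stem index is stored per slot, a pattern that uses the second stem for only one or two forms needs no special handling.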


IV. TWO-STEM NOUNS 

We encounter two-stem nouns only in combination with the following five 
suffixes : 


Table 21: SUFFIXES OF TWO-STEM NOUNS

  Suffix    Suffix Code

  0         001
  6         002
  a         003
  o         011
  ia        034


On the basis of the two stems the machine produces an unambiguous and 
correct declension. 

Three-stem nouns belong to the 5% irregular nouns which will be declined
manually. We are only interested in providing for the declension of the nouns
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V. CONCLUDING REMARK 

This system of dictionary preparation was developed especially for 
the Unified Transfer System. It applies, however, equally to the compilation 
of any other dictionary. The present report is limited to Russian nouns, 
but a corresponding analysis has been drafted for the declension of remaining 
nominals (numerals, etc.), as well as all modifiers (adjectives, numerals, 
long forms of participles, etc.), and the conjugation of verbs. The three 
reports will eventually be incorporated in a working manual.
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[Working chart of noun declension patterns. The legible column headings are
Declension Pattern Number, Declension Type, and Animate or Inanimate; the body
of the chart is not recoverable from this copy. The following remarks from the
chart can still be read:]

some foreign loan words and certain foreign names
very large group; for animate nouns, see 8b and 10a-b
close to types 8a, 10a-b
corresponds to 2e (anim.)
very close to 10b (inan.)
corresponds to 9a (anim.)
corresponds to 2a (anim.)
no sub-group (see also 11)
irreg. fem. nouns like zmeia and gost6ia to be declined manually
gen. and acc. plur. are identical
corresponds to 1a (inan.)
corresponds to 2b (inan.)
close to 1a-b; corresponds to 10d (anim.)
close to [...]; see also 8a-b (anim.)
no sub-group (see also type 3)
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
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
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SECTION THREE 

REPORT ON COMPUTER IMPLEMENTATION 
OF THE UNIFIED TRANSFER SYSTEM 


By 

B. D. Blickstein 
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The basic flow diagram, Figure 1 on the next page, traces the basic 
functions which the computer must follow, and shows the necessary magnetic 
tape configuration. Also shown on this chart is an index to the tapes, 
showing the processes in which each tape is involved. 

The flow chart is divided into the following computer program steps: 

1. Text Preparation

The entry to this box is the raw text, prepared either by keypunching
from the Russian or by a character-scanning device. The function of
this program is to convert the text to a form which the machine may more easily
accept. At the same time, Romanized expressions will be extracted and saved
for later re-entry into the system. At this point, a transliteration of the
text can be produced.

2. Alpha Sort

The sequenced and prepared text is now sorted into dictionary order.
The original text sequence numbers are retained.

3. Dictionary Search

The sorted text is matched against the dictionary tape. For each
text entry for which a dictionary match exists, a record will be written on
tape D, consisting of the appropriate pattern number and the set of English
meanings, still retaining the text sequence number. For each text entry which
has no match, a dummy "word missing" record will be written, and the Russian
word written on the "missing entries" tape D1 for subsequent printing.

4. Sequence Sort

Tape D is now sorted back into text sequence. At the end of the
sort, a split of the tape D record will occur, creating two tapes, E and E1.
Tape E contains only pattern numbers, and tape E1 the corresponding sets of
English meanings.
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5. Unified Transfer

The basic code matching algorithms are performed here. Blocks are
recognized, and the proper meaning selections are made; the output is the
sequenced selections tape. The computer considerations of this section will be
treated later at some length.

6. English Extraction

The selections tape is used to select the proper English meaning
from the English tape at this point. The output is an English text with
certain block marks present.

7. Syntactic Ordering

Re-arrangement of the syntactic blocks is performed here; at the
same time, the Romanized expressions are merged back into the text, and a
final translation tape, suitable for printing, is produced.
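Steps 2 through 4 above (alpha sort, dictionary search, sequence sort and split) can be sketched as follows. This is a minimal illustration and not the report's program: tapes are modelled as Python lists, the dictionary search is written as a single forward pass over two sorted lists (which is what sorting the text first makes possible), and the dictionary entries, pattern numbers and function names are hypothetical placeholders.

```python
def alpha_sort(words):
    """Step 2: number the words in text order, then sort by word."""
    numbered = list(enumerate(words, start=1))
    return sorted(numbered, key=lambda pair: pair[1])


def dictionary_search(sorted_text, sorted_dictionary):
    """Step 3: one forward pass over both sorted lists."""
    tape_d, tape_d1 = [], []
    i = 0
    for seq, word in sorted_text:
        # Advance the dictionary until it reaches or passes the text word.
        while i < len(sorted_dictionary) and sorted_dictionary[i][0] < word:
            i += 1
        if i < len(sorted_dictionary) and sorted_dictionary[i][0] == word:
            _, pattern, meanings = sorted_dictionary[i]
            tape_d.append((seq, pattern, meanings))
        else:
            tape_d.append((seq, None, ["word missing"]))  # dummy record
            tape_d1.append(word)                          # missing entries
    return tape_d, tape_d1


def sequence_sort_and_split(tape_d):
    """Step 4: restore text order, then split into tapes E and E1."""
    in_order = sorted(tape_d, key=lambda record: record[0])
    tape_e = [(seq, pattern) for seq, pattern, _ in in_order]
    tape_e1 = [(seq, meanings) for seq, _, meanings in in_order]
    return tape_e, tape_e1


# Hypothetical dictionary: (word, pattern number, English meanings), sorted.
DICTIONARY = [("bukva", 113, ["letter"]), ("kniga", 114, ["book"])]

tape_d, tape_d1 = dictionary_search(alpha_sort(["kniga", "zzz", "bukva"]), DICTIONARY)
tape_e, tape_e1 = sequence_sort_and_split(tape_d)
print(tape_e, tape_e1, tape_d1)
```

The dictionary pointer is never moved backward, so a word that occurs several times in the text is matched against the same entry each time without rewinding.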

Some discussion of the matching algorithms is appropriate here; the first
part of the process is shown in Figure 2 on the next page. This involves the
identification of phrases by means of the parts-of-speech code numbers, which
we shall refer to as progression numbers. Let PR(j) be the progression number
associated with the jth word in text sequence. As the translation progresses,
suppose all phrases through the (j-1)th word are strung, and we thus wish to
find the boundaries of the phrase beginning with the jth word. The flow chart
(beginning at step t0) traces the entire technique for identifying the phrase.
At the conclusion of this process, the phrase is bounded, and the code matching
on the actual dictionary patterns may commence. It can be seen that this
algorithm involves little else than a few arithmetic counts and comparisons,
and certainly no analysis of the source language is performed. This example
serves well to point up the essential philosophy of the Unified Transfer
technique: the computer is used for the things it does best, namely arithmetic
and logic, while the analysis is done in advance by means of the dictionary.
We do not ask the computer
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to come to conclusions about form; we merely ask it to choose between various
possible forms on the basis of simple logical rules. In this way, the full power
of the machine is used in the most efficient manner.

The subsequent code matching process is also designed with this same
philosophy. The only question asked is basically an "equal-or-unequal" choice;
blocking for syntactic re-arrangement is similarly well suited to this type of
treatment. In no case does the machine ever "know" about syntax or meaning; it
only follows completely abstract rules for operating on certain numerical
sequences.
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